skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Houshmand, Farzin"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Tensor compilers, essential for generating efficient code for deep learning models across various applications, employ tensor graph rewrites as one of the key optimizations. These rewrites optimize tensor computational graphs with the expectation of preserving semantics for tensors of arbitrary rank and size. Despite this expectation, to the best of our knowledge, there does not exist a fully automated verification system to prove the soundness of these rewrites for tensors of arbitrary rank and size. Previous works, while successful in verifying rewrites with tensors of concrete rank, do not provide guarantees in the unbounded setting. To fill this gap, we introduce TensorRight, the first automatic verification system that can verify tensor graph rewrites for input tensors of arbitrary rank and size. We introduce a core language, TensorRight DSL, to represent rewrite rules using a novel axis definition, called aggregated-axis, which allows us to reason about an unbounded number of axes. We achieve unbounded verification by proving that there exists a bound on tensor ranks, under which bounded verification of all instances implies the correctness of the rewrite rule in the unbounded setting. We derive an algorithm to compute this rank using the denotational semantics of TensorRight DSL. TensorRight employs this algorithm to generate a finite number of bounded-verification proof obligations, which are then dispatched to an SMT solver using symbolic execution to automatically verify the correctness of the rewrite rules. We evaluate TensorRight’s verification capabilities by implementing rewrite rules present in XLA’s algebraic simplifier. The results demonstrate that TensorRight can prove the correctness of 115 out of 175 rules in their full generality, while the closest automatic, bounded-verification system can express only 18 of these rules. 
    more » « less
    Free, publicly-accessible full text available January 22, 2026
  2. Tensor compilers, essential for generating efficient code for deep learning models across various applications, employ tensor graph rewrites as one of the key optimizations. These rewrites optimize tensor computational graphs with the expectation of preserving semantics for tensors of arbitrary rank and size. Despite this expectation, to the best of our knowledge, there does not exist a fully automated verification system to prove the soundness of these rewrites for tensors of arbitrary rank and size. Previous works, while successful in verifying rewrites with tensors of concrete rank, do not provide guarantees in the unbounded setting. To fill this gap, we introduce TensorRight, the first automatic verification system that can verify tensor graph rewrites for input tensors of arbitrary rank and size. We introduce a core language, TensorRight DSL, to represent rewrite rules using a novel axis definition, calledaggregated-axis, which allows us to reason about an unbounded number of axes. We achieve unbounded verification by proving that there exists a bound on tensor ranks, under which bounded verification of all instances implies the correctness of the rewrite rule in the unbounded setting. We derive an algorithm to compute this rank using the denotational semantics of TensorRight DSL. TensorRight employs this algorithm to generate a finite number of bounded-verification proof obligations, which are then dispatched to an SMT solver using symbolic execution to automatically verify the correctness of the rewrite rules. We evaluate TensorRight’s verification capabilities by implementing rewrite rules present in XLA’s algebraic simplifier. The results demonstrate that TensorRight can prove the correctness of 115 out of 175 rules in their full generality, while the closest automatic,bounded-verification system can express only 18 of these rules. 
    more » « less
    Free, publicly-accessible full text available January 7, 2026
  3. Inter-organizational systems where subsystems with partial trust need to cooperate are common in healthcare, finance and military. In the face of malicious Byzantine attacks, the ultimate goal is to assure end-to-end policies for the three aspects of trustworthiness: confidentiality, integrity and availability. In contrast to confidentiality and integrity, provision and validation of availability has been often sidestepped. This paper guarantees end-to-end policies simultaneously for all the three aspects of trustworthiness. It presents a security-typed object-based language, a partitioning transformation, an operational semantics, and an information flow type inference system for partitioned and replicated classes. The type system provably guarantees that well-typed methods enjoy noninterference for the three properties, and that their types quantify their resilience to Byzantine attacks. Given a class and the specification of its end-to-end policies, the Hamraz tool applies type inference to automatically place and replicate the fields and methods of the class on Byzantine quorum systems, and synthesize trustworthy-by-construction distributed systems. The experiments show the resiliency of the resulting systems; they can gracefully tolerate attacks that are as strong as the specified policies. 
    more » « less
  4. Data centers are increasingly equipped with RDMAs. These network interfaces mark the advent of a new distributed system model where a node can directly access the remote memory of another. They have enabled microsecond-scale replicated services. The underlying replication protocols of these systems execute all operations under strong consistency. However, strong consistency can hinder response time and availability, and recent replication models have turned to a hybrid of strong and relaxed consistency. This paper presents RDMA replicated data types, the first hybrid replicated data types for the RDMA network model. It presents a novel operational semantics for these types that considers three distinct categories of methods and captures their re- quired coordination, and formally proves that they preserve convergence and integrity. It implements these semantics in a system called Hamband that leverages direct remote accesses to efficiently implement the required coordination protocols. The empirical evaluation shows that Hamband outperforms the throughput of existing message-based and SMR-based implementations by more than 4x. 
    more » « less
  5. Lahiri, Shuvendu K.; Wang, Chao (Ed.)
    Replication is a common technique to build reliable and scalable systems. Traditional strong consistency maintains the same total order of operations across replicas. This total order is the source of multiple desirable consistency properties: integrity, convergence and recency. However, maintaining the total order has proven to inhibit availability and performance. Weaker notions exhibit responsiveness and scalability; however, they forfeit the total order and hence its favorable properties. This project revives these properties with as little coordination as possible. It presents a tool called 𝐻𝑎𝑚𝑝𝑎 that given a sequential object with the declaration of its integrity and recency requirements, automatically synthesizes a correct-by-construction replicated object that simultaneously guarantees the three properties. It features a relational object specification language and a syntax-directed analysis that infers optimum staleness bounds. Further, it defines coordination-avoidance conditions and the operational semantics of replicated systems that provably guarantees the three properties. It characterizes the computational power and presents a protocol for recency-aware objects. 𝐻𝑎𝑚𝑝𝑎 uses automatic solvers statically and embeds them in the runtime to dynamically decide the validity of coordination-avoidance conditions. The experiments show that recency-aware objects reduce coordination and response time. 
    more » « less
  6. Graph analytics elicits insights from large graphs to inform critical decisions for business, safety and security. Several large-scale graph processing frameworks feature efficient runtime systems; however, they often provide programming models that are low-level and subtly different from each other. Therefore, end users can find implementation and specially optimization of graph analytics error-prone and time-consuming. This paper regards the abstract interface of the graph processing frameworks as the instruction set for graph analytics, and presents Grafs, a high-level declarative specification language for graph analytics and a synthesizer that automatically generates efficient code for five high-performance graph processing frameworks. It features novel semantics-preserving fusion transformations that optimize the specifications and reduce them to three primitives: reduction over paths, mapping over vertices and reduction over vertices. Reductions over paths are commonly calculated based on push or pull models that iteratively apply kernel functions at the vertices. This paper presents conditions, parametric in terms of the kernel functions, for the correctness and termination of the iterative models, and uses these conditions as specifications to automatically synthesize the kernel functions. Experimental results show that the generated code matches or outperforms handwritten code, and that fusion accelerates execution. 
    more » « less
  7. null (Ed.)
    The value of cryptocurrencies is highly volatile and investors require fast and reliable exchange systems. In cross-chain transactions, multiple parties exchange assets across multiple blockchains which can be represented as a directed graph with vertexes V as parties and edges E as asset transfers. In a simple form, cross-chain transactions are cross-chain swaps where each edge e transfers an asset that the head of e already owns. However, in general, a cross-chain transaction includes a sequence of exchanges at each blockchain. Further, transactions may have off-chain steps and hence may not be strongly connected. Given a transaction, protocols are desired that guarantee the following property called uniformity. If all parties conform to the protocol, all the assets should be transferred. Further, if any party deviates from the protocol, the conforming parties should not experience any loss. Previous work introduced a uniform protocol for strongly connected cross-chain swaps and showed that no uniform protocol exists for transactions that are not strongly connected. We present a uniform protocol for general cross-chain transactions with sequenced and off-chain steps when a few certain parties are conforming. Further, we prove a new property called end-to-end that guarantees that if the source parties pay, the sink parties are paid. We present a synthesis tool called XCHAIN that given a high-level description of a cross-transaction can automatically generate smart contracts in Solidity for all the parties. 
    more » « less
  8. null (Ed.)